A MANUAL FOR CODING STEROIDS


INTRODUCTION 


The system devised by the Office of Research 
and Development (R & D) of the U. S. Patent Office 
for the mechanized searching of steroid compounds 
is described in R & D Report No. 7 entitled A 
Punched Card System for Searching Steroid Com- 
pounds. 


SCOPE OF THE ART IN THE SYSTEM 


The system is limited to the steroid art. The 
patents included in the system are those classified 
in Class 260, Subclasses 239.5, 239.55, 239.57,
397, 397.1, 397.2, 397.25, 397.3, 397.4, 397.45,
397.47, and 397.5.

Only those steroids disclosed in the patents which
meet the definitions of the above Class and Subclasses
have been included in the system.

The seco and homo steroids are excluded. Also
excluded are steroids classified elsewhere according
to the Rule of Superiority in classification.


GENERAL CODING PRINCIPLES 


Every compound which is coded must contain the 
steroid nucleus shown in Figure 1. A fixed number- 
ing system is assigned to this nucleus (see Figure 
4), the fixed numbers serving to identify the po- 
sitional locations of nuclear substitution. 


Figure 1 


Coding of the patents is done on a one-page form 
(front and back) on which the terms representing 
information to be coded with their corresponding 
codes are assembled. The top side of the form is 
designated “2A” and is shown in Figure 2. The 
reverse side is designated “2B” and is shown in 
Figure 3. The terms are called “descriptors.” 
See pages 6 and 7. 


Composite Coding 


Composite coding is described in R & D Report 
No. 7. Briefly, a group of formulas of related 
compounds are “composited” into a single syn- 
thetic formula to reduce the number of total codes 
per document. 


Multiple Coding 


The terms which have been selected as de- 
scriptors present a certain degree of overlap in 
concept. Thus a compound or group of compounds 
may be describable by more than one term. How- 
ever, because the terms represent various levels of 
specificity (or genericity), all terms which are ap- 
plicable to a particular configuration are used. 


Relationship of Codes and Punched-Card Format 


Each descriptor is defined by a single set of 
numbers designating a particular column and row 
of the standard 80-column IBM punched card. 
Conversely, each column-row location on the IBM 
card is a fixed allocated space for a particular de- 
scriptor. The card is divided into two fields: 
columns 1 through 59 constituting the field for 2A 
terms and columns 60 through 67 constituting the 
field for 2B terms, more fully described below. 


NUMBERING SYSTEM FOR THE NUCLEUS 


The following fixed numbering system is assigned 
to the steroid nucleus: 


Figure 4 


Positional locations 18 through 22 are variables in
the sense that they may not be present in a particular
compound. The locations exist only when they
are occupied by carbon atoms.


ORGANIZATIONAL ARRANGEMENT OF TERMS 


2A Terms—General Description 


The 2A terms relate to chemical configurations 
and substituents on the steroid nucleus (Figure 4). 
The codes for these terms have been devised to 
show the positional locations of the descriptors on 
the nucleus. Since there are 22 possible positional 
locations, two columns on the IBM punched card 
have been allocated to each 2A descriptor. Thus,
the first two columns are assigned




Patent No. ..........  Examiner ..........  Punched ..........

On the form, each descriptor listed below is followed by the
positional locations 22, 1 through 9, and 10 through 21, which
are encircled as applicable.

==
α or allo
-C≡C-
CH3
CN
COOH or COOR
-C-sub
-H
NH2 or N<
OH
=O
-Se-
-S-R
Hal
Hydrocarbon
Ketal
Ketone reagent
Epoxy
-O hydrocarbon
-O Acyl
-O-hetero
-N-hetero
S-hetero
Miscellaneous

Figure 2
Column 60
 0  O-Acyl
 1    Carboxylic
 2    Poly
 3    Unsat.
 4    Aromatic
 5    Aliphatic
 6    Substi.
 7    St. chain
 8    Cycloalkyl
 9    Branched
11    Heterocyclic
12    Inorganic except hal

Column 61
 0  O-Hetero
 1    Morpholine
 2    Furan
 3    Lactone
 4    Spirostane
 5    Sub in O spiro ring
 6    Pseudosapo.

Column 62
 0  N-Hetero
 1    Morpholine
 2    Piperidine
 3    Pyridine
 4    Pyrimidine
 5    Pyrrole
 6    Thiazole

Column 63
 0  S-Hetero
 1    Thiophene
 2    Thiazole

Column 64
 0  Bile cpds.
 1    Acids
 2    Cholanic
 3    Norcholanic
 4    Bisnorcholanic

Column 65
 0  Sterols
 1    Ergosterol
 2    Cholesterol
 3    Vitamin D3

Column 66
 0  Hal
 1    Fl
 2    Br
 3    I
 4    Cl
 5-11  Double bond: 5 (6), 5 (10), 8 (9), 8 (14), 1 (2), 1 (10)

Column 67
 1  Androstane
 2  Add. compd.
 3  Maleic adduct
 4  Pregnane
 5  21 Unsubstituted
 6  21 Diazo

Figure 3
to “==”, the next two columns to “α or allo”, etc.
(see Figure 2). Positional location 1 on the nucleus
is coded in row 1 on the IBM punched card, location
2 in row 2, etc.

Figure 5 demonstrates the use of the columns and 
rows of the IBM card for the 2A term “==”. The 
code for “==” is tabulated for each of the 22 po- 
sitional locations on the nucleus. Figure 5 also 
shows the correspondence of the rows of the IBM 
card with the positional locations. 


                    CODE

Positional     IBM Card     IBM Card
Location       Column       Row

    1             1            1
    2             1            2
    3             1            3
    4             1            4
    5             1            5
    6             1            6
    7             1            7
    8             1            8
    9             1            9
   10             2            0
   11             2            1
   12             2            2
   13             2            3
   14             2            4
   15             2            5
   16             2            6
   17             2            7
   18             2            8
   19             2            9
   20             2           11
   21             2           12
   22             1            0

Figure 5


Note: The first column of a pair of columns for a
2A term is used for positional locations 1 to 9 and
22; the second column is used for locations 10 to
21. Rows 1 to 9 of the first column correspond to
positional locations 1 to 9, respectively; row 0
corresponds to location 22. Rows 0 to 9 of the second
column correspond to positional locations 10 to 19,
respectively; row 11 corresponds to location 20, and
row 12 to location 21.
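
The column and row rule in the Note can be sketched in a few lines of Python. This is only an illustration: the function name `punch_2a` and the ASCII descriptor labels are hypothetical, and the descriptor order (hence the column pair for each descriptor) is an assumption read off the coding form in Figure 2 and the worked examples in this manual.

```python
# Order of the 2A descriptors on the coding form (Figure 2). Each descriptor
# is assumed to own one pair of adjacent card columns: 1-2, 3-4, ..., 47-48.
DESCRIPTORS_2A = [
    "double bond", "alpha or allo", "ethinyl", "CH3", "CN", "COOH or COOR",
    "-C-sub", "-H", "NH2", "OH", "=O", "-Se-", "-S-R", "Hal", "Hydrocarbon",
    "Ketal", "Ketone reagent", "Epoxy", "-O hydrocarbon", "-O acyl",
    "-O-hetero", "-N-hetero", "S-hetero", "Miscellaneous",
]


def punch_2a(descriptor: str, location: int) -> tuple[int, int]:
    """Return (card column, card row) for a 2A descriptor at a location."""
    if not 1 <= location <= 22:
        raise ValueError("positional location must be between 1 and 22")
    first = 2 * DESCRIPTORS_2A.index(descriptor) + 1  # first column of the pair
    if location <= 9:
        return first, location            # rows 1-9 of the first column
    if location == 22:
        return first, 0                   # row 0 of the first column
    if location <= 19:
        return first + 1, location - 10   # rows 0-9 of the second column
    return first + 1, 11 if location == 20 else 12  # locations 20 and 21


print(punch_2a("ethinyl", 17))  # the 17-ethinyl example: column 6, row 7
```

Under these assumptions the function reproduces the codes given in the worked examples, for instance column 22, row 11 for a keto group in location 20.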


2B Terms—General Description 


The 2B terms are, with a few exceptions, broader
and more generalized in scope. The terms incorporate
generic and subgeneric concepts related
to the compound as a whole, e. g., “N-hetero,”
and, as types of “N-hetero” compounds, “morpholine,”
“piperidine,” etc. The terms are not associated
with a positional location. The column
and row numbers allocated to these terms are shown
in Figure 3, the column numbers being given at the
top of each group and the row numbers being listed
thereunder. The 2B terms corresponding to the
row numbers appear to the right of the numbers.


Pregnanes and Androstanes 


When the steroid nucleus has a chain of 2 carbons 
in single or double bond linkage as in Figure 6, 




Figure 6 


the compound containing this structure is coded as 
a pregnane. If the 2 carbon chain is in triple bond 
linkage as in Figure 7, 


Figure 7 


the compound containing the structure is coded as 
an androstane, unless X represents the linkages 
shown in Figure 6. 

More generally, androstanes are structures 
which have a non-hydrocarbon substituent, a methyl 
group, a non-hydrocarbon substituted methyl group, 
or an ethinyl linkage in the 17 position. 

Structures containing groups characteristic of 
both the pregnanes and androstanes as defined 
above are coded as pregnanes only, i.e., the 
androstane structure is subordinated to the pregnane.
An example of such a structure is given in Figure 8.


Figure 8


DEFINITIONS OF DESCRIPTORS 


The definitions which follow have not necessarily 
been assigned from the point of view of strict ad- 
herence to chemical principles but from the point of 
view of permitting retrieval of desired classes of 
steroid compounds. 


Definitions—2A Terms 


==


This symbol designates a double bond in the
nucleus. In coding the positional location of a
double bond, the lower number of the pair of position
numbers is indicated. Thus, for Δ4,5, position 4
is coded. Where any of positions 1, 5
and 8 are involved, 2B terms are added to distinguish
between positions 1-2, 1-10, 5-6, 5-10,
8-9, 8-14.
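
The double-bond rule can be sketched as follows. The function name `code_double_bond` is hypothetical; the sketch assumes that the double-bond descriptor occupies card columns 1 and 2 (as the worked examples suggest) and returns the 2B term by its label on the coding form rather than by row number, since the individual rows are not spelled out in the text.

```python
# The six double bonds that need a distinguishing 2B term (column 66).
AMBIGUOUS_BONDS = {(1, 2), (1, 10), (5, 6), (5, 10), (8, 9), (8, 14)}


def code_double_bond(a: int, b: int):
    """Return ((column, row), 2B label or None) for a double bond a-b."""
    low, high = sorted((a, b))
    if not 1 <= low <= 21:
        raise ValueError("unexpected positional location")
    # Only the lower position number is coded in the 2A field.
    if low <= 9:
        punch = (1, low)
    elif low <= 19:
        punch = (2, low - 10)
    else:
        punch = (2, 11 if low == 20 else 12)
    # A 2B term is added only where the lower number alone is ambiguous.
    label = f"{low} ({high})" if (low, high) in AMBIGUOUS_BONDS else None
    return punch, label


print(code_double_bond(4, 5))   # lower number 4 is coded, no 2B term
print(code_double_bond(5, 10))  # position 5 is coded with the "5 (10)" term
```

Note that a 4-5 double bond touches position 5 but is coded at position 4, so it needs no 2B term in this sketch.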


α or allo


This term represents isomeric forms of a structure.
In the case of the 3,5 or i position in Ring A,
as in i-cholesterol, the codes are assigned as
though two positional locations were occupied.


Example of coding: 




Figure 9 


Codes: Column 3 Row 3 
Column 3 Row 3 (Allo) 


-C≡C-
This is the ethinyl linkage. 


Example of coding: 
(a 17-ethinyl androstane) 




Figure 10 


Codes: Column 6 Row 7 (ethinyl) 
Column 67 Row 1 (androstane)


CH3

This represents a methyl group. Methyl groups 
which appear in positional locations 18, 19, 20, and 
21, when present, are considered as integral por- 
tions of the nucleus and are not coded as methyl 
substituents. When a methyl group is coded, the 
2A term hydrocarbon is also coded. 

Example of coding: 

(a 17-methyl pregnane) 


Figure 11 


In Figure 11, the methyl group in position 17 only 
(encircled) is coded as follows: 


Codes: Column 8 Row 7 (methyl group) 
Column 30 Row 7 (hydrocarbon) 
Column 67 Row 4 (pregnane) 


CN 
CN represents the nitrile group and may be 
coded with respect to any of the 22 positions. 


COOH or COOR 


COOH and COOR refer to carboxylic acids and 
their salts or esters, respectively, where the car- 
boxyl group is directly attached to the nucleus, and 
may be coded in any of the 22 positions. 


Example of coding: 
(a methyl ester of 17-carboxy androstane) 


COOCH3 


Figure 12 


Codes: Column 12 Row 7 (COOH or COOR) 
Column 67 Row 1 (androstane) 


-C-sub.


This symbol represents a non-hydrocarbon substituent
linked through a carbon to the steroid
nucleus and not specifically provided for on the list.


Example of coding:
(3-chloro methyl androstane)


Figure 13


Codes: Column 13 Row 3 (-C-sub.)
Column 67 Row 1 (androstane)


-H

This symbol is applied only when the hydrogen
atom is present in the 10 or 13 position or both
to indicate hydrogen in place of C19 or C18 or
both.


NH2 or N<

This symbol represents an amino or substituted
amino group. The group is not applicable when the
nitrogen atom is part of a nitro group or a
heterocyclic ring.


OH

This represents the hydroxy group.


=O

The =O represents the ketone or aldehyde
group.

Examples of coding:


Codes: Column 22 Row 11 (aldehyde)
Column 67 Row 1 (androstane)


Codes: Column 22 Row 11 (ketone)
Column 67 Row 4 (pregnane)


Figure 14


-Se-


This symbol represents selenium or substituted
selenium where the selenium is directly attached
to the steroid nucleus.


-S-R

This symbol refers to sulfur or substituted sulfur
where the sulfur is directly attached to the steroid
nucleus.


Hal

This symbol represents the halogen group. Halogens
are coded in 2A with respect to positions.
Column 66, rows 0 to 4, are used to further describe
the type of halogen.


Example of coding:
(a 4-chloro androstane)


Figure 15


Codes: Column 27 Row 4 (halogen)
Column 66 Row 0 (halogen containing, broadly)
Column 66 Row 4 (chloro containing, broadly)
Column 67 Row 1 (androstane)


Hydrocarbon

This term represents hydrocarbon in any of the
positions 1 to 22. Where the hydrocarbon group
is a methyl 18 or a methyl 19, it is not coded. Also,
in pregnanes, the chain of 2 carbons attached to
position 17 is not coded as a hydrocarbon. In
cholesterols and other sterols, the hydrocarbon
chain in the 17, 20 and 22 positions is not coded as
hydrocarbon.


Ketal 


This designation refers to the reaction product 
resulting from the reaction of a keto or aldehyde 
group with an alcohol to give a ketal or cyclic 
ketal. The term “ketal” also includes the term
“acetal.”


Example of coding: 
(a 3, 20-diethylene ketal pregnane) 


Figure 16


Codes: Column 31 Row 3 (ketal, location 3) 
Column 32 Row 11 (ketal, location 20) 
Column 61 Row 0 (O-hetero compound) 


The last code, 61-0, is included since the cyclic 
ketal is an oxygen-containing heterocycle. 


Ketone reagent 


This designation refers to a reaction product of
the ketone or aldehyde attached to any of the
positions 1 to 22 with a well-known ketone reagent
(carbazide, semicarbazide, hydroxylamine, etc.).


Epoxy 


This designation refers to an epoxy group at- 
tached to two nuclear carbon atoms. 


Example of coding:
(a 3,4-epoxy pregnane)


Figure 17


Codes: Column 35 Row 3 (epoxy, location 3)
Column 35 Row 4 (epoxy, location 4)
Column 61 Row 0 (O-hetero compound)
Column 67 Row 4 (pregnane)


-O hydrocarbon 


This designation refers to ethers or to any hydrocarbon
attached to the nucleus through an oxygen
atom.


Example of coding: 
(a 3-methoxy pregnane) 


Figure 18 


Codes: Column 37 Row 3 (-O hydrocarbon, 
location 3) 
Column 67 Row 4 (pregnane) 


-O acyl 

This designation refers to an ester group at- 
tached to the steroid through the -O- atom of the 
group. Note that -O acyl is further defined by the 
2B terms in column 60, rows 0 to 12. (See Figure 
3). 

Example of coding: 

(a 21-acetyl pregnane) 




Figure 19 


Codes: Column 40 Row 12 (-O acyl, location 21) 
Column 60 Rows 0, 1, 5, 7 (O acyl and 
subgeneric terms) 
Column 67 Row 4 (pregnane) 


-O-Hetero 


This designation refers to an oxygen-containing
heterocyclic group. The heterocycle can be attached
through any of the atoms of the heterocycle
at any one of the 22 positional locations.

Example of coding: 

(a 3-furyl pregnane) 




Figure 20 


Codes: Column 41 Row 3 (O-hetero, location 3) 
Column 61 Row 0 (O-hetero compound) 
Column 61 Row 2 (O-hetero, furan) 
Column 67 Row 4 (pregnane) 


N-hetero 


This designation refers to a nitrogen-containing
heterocyclic group attached to any one of the 1 to
22 positions through any of the atoms of the heterocyclic
group. Codes from 2B terms are also applied
as in the following example.


Example of coding: 
(a 3-pyrrole pregnane) 


Figure 21


Codes: Column 43 Row 3 (-N-hetero, location 3) 
Column 62 Row 0 (N-hetero compound) 
Column 62 Row 5 (N-hetero, pyrrole) 
Column 67 Row 4 (pregnane) 


S-hetero 


This designation refers to any sulfur-containing
heterocyclic group attached to any of the 1 to 22
positions of the steroid molecule through any of
the atoms of the heterocycle. Appropriate 2B
terms are also applied.


Example of coding: 
(a 20-thiophene pregnane) 



Figure 22 


Codes: Column 46 Row 11 (-S-hetero, location 20) 
Column 63 Row 0 (S-hetero compound) 
Column 63 Row 1 (S-hetero, thiophene) 
Column 67 Row 4 (pregnane) 


Miscellaneous 


This designation is the catch-all for any groups 
which are attached to any of the 1 to 22 positions 
and are not provided for above. 


Definitions—2B Terms 


The terms in 2B are, with few exceptions, not 
related to positions of substitution on the steroid 
nucleus. Several of the terms overlap 2A terms 
in concept. However, these overlap terms provide 
additional description of the group represented by 
a 2A term. 

The 2B terms incorporate several levels of 
genericity. Indentations under any term provide 
further description or delineation of said term. 

The terms are discussed below in the order of 
their allocated spaces on the IBM punched card. 


Column 60 


0. O-Acyl.—An ester group attached to the 
steroid nucleus through the -O- atom. The de- 
scriptors indented under O-Acyl provide greater 
specificity as to the acid radical of the ester group. 

1. Carboxylic.—The carboxylic acid radical. 

2. Poly.—The polycarboxylic acid radical. 

3. Unsat.—An unsaturated acid radical. Aromatic
unsaturation is excluded. Thus, the benzoic
acid group is not coded as an unsaturated acid radical.

4. Aromatic.—The carboxylic acid radical con- 
taining an aromatic hydrocarbon group. 

5. Aliphatic.—The aliphatic carboxylic acid rad- 
ical. 

6. Subst.—An aliphatic acid radical substituted 
by a non-hydrocarbon group. 


7. St. chain.—A straight chain aliphatic carboxylic
acid group with no substituents.

8. Cycloalkyl.—An aliphatic carboxylic acid
group containing a cycloalkyl group.

9. Branched.—An aliphatic carboxylic acid group
containing a branched alkyl group.

11. Heterocyclic.—A heterocyclic acid radical.

12. Inorganic—except hal.—Noncarboxylic inorganic
acid radicals except halogen acid radicals.

Note.—By the multiple coding principle
(supra, p. 5), all applicable terms are applied to a
particular compound.


Example of coding: 
(a 3-cyclohexyl formyl pregnane) 




Codes: Column 39 Row 3 (-O acyl, location 3) 
Column 60 Row 0 (O-acyl, compound) 
Column 60 Row 1 (O-acyl, carboxylic) 
Column 60 Row 5 (O-acyl, aliphatic) 
Column 60 Row 8 (O-acyl, cycloalkyl) 
Column 67 Row 4 (pregnane) 

Column 67 Row 5 (21-unsubstituted) 


Figure 23 
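
The multiple coding of the column 60 terms in this example can be sketched as a walk from the most specific term up to the generic one. The `PARENT` relationships below are an assumption inferred from the wording of the definitions and from the two worked examples (Figures 19 and 23); the names `ROW`, `PARENT` and `column_60_rows` are hypothetical.

```python
# Row numbers of the column 60 terms, as listed in the definitions.
ROW = {
    "O-acyl": 0, "carboxylic": 1, "poly": 2, "unsat": 3, "aromatic": 4,
    "aliphatic": 5, "subst": 6, "st. chain": 7, "cycloalkyl": 8,
    "branched": 9, "heterocyclic": 11, "inorganic": 12,
}

# Assumed generic-to-specific relationships (child -> parent).
PARENT = {
    "carboxylic": "O-acyl",
    "poly": "carboxylic",
    "aromatic": "carboxylic",
    "aliphatic": "carboxylic",
    "subst": "aliphatic",
    "st. chain": "aliphatic",
    "cycloalkyl": "aliphatic",
    "branched": "aliphatic",
}


def column_60_rows(term: str) -> list[int]:
    """Return every column 60 row implied by one specific acyl term."""
    rows = []
    current = term
    while current is not None:
        rows.append(ROW[current])
        current = PARENT.get(current)  # None once the top level is reached
    return sorted(rows)


print(column_60_rows("cycloalkyl"))  # rows coded in Figure 23
print(column_60_rows("st. chain"))   # rows coded in Figure 19
```

A radical that falls under several specific terms at once would simply take the union of the rows returned for each term.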


Column 61


0. O-hetero.—An oxygen-containing heterocyclic
group.

1. Morpholine.—The O-heterocycle in morpholine
configuration.

2. Furan.—The O-heterocycle in furan configuration
inclusive of saturation and unsaturation.

3. Lactone.—The O-heterocycle in lactone arrangement.

4. Spirostane.—The spirostane group normally
found in sapogenins shown in Figure 24.


Figure 24


5. Sub in O-spiro ring.—Spirostanes containing 
substituents in the oxygen-containing ring other 
than hydrocarbon substituents. 

6. Pseudosapo.—The pseudosapogenins are ace- 
tic anhydride or other anhydride derivatives of 
spirostanes or sapogenins. A pseudosapogenin is 
shown in Figure 25. 


Acyl 


Figure 25 


9. Misc.—O-hetero steroids not specifically pro- 
vided for by the terms above. 


Column 62 


0. N-hetero.—Nitrogen-containing heterocyclic 
group. 

1. Morpholine.—The morpholine group. (The
morpholine group is coded both as an O-hetero and
an N-hetero group).

2. Piperidine.—The piperidine nucleus. 

3. Pyridine.—The pyridine group which includes 
the dihydro and tetrahydro forms. 

4. Pyrimidine.—The pyrimidine group which in- 
cludes the dihydro and tetrahydro forms. 

5. Pyrrole.—The pyrrole group which includes 
the saturated and unsaturated forms. 

6. Thiazole.—The thiazole group which includes 
the saturated and unsaturated forms. 

9. Misc.—Any nitrogen-containing heterocyclic 
groups not specifically provided for by the terms 
above. 


Column 63 


0. S-hetero.—Sulfur-containing heterocyclic 
groups. 

1. Thiophene.—The thiophene ring which includes 
the saturated and unsaturated forms. 

2. Thiazole.—The thiazole group. (The thiazole
group is coded both as an N-hetero and an S-hetero
group).

9. Misc.—Any sulfur-containing heterocyclic
group not specifically provided for by the terms
above.


Column 64 


0. Bile cpds.—This term includes bile acids,
esters, and amides. The compounds are generally
recognized by the presence of a chain of 3 to 5
carbons in the 17 position.

1. Acids.—This term includes the bile acids,
salts, amides, and esters.

2. Cholanic.—The cholanic group. 

3. Norcholanic.—The norcholanic group. 

4. Bisnorcholanic.—The bisnorcholanic group. 


Example of coding: 
(a cholanic acid) 




— C-C - COOH 


Figure 26 


The underlined portion is coded as follows: 


Codes: Column 64 Row 0 (bile compounds)
Column 64 Row 1 (bile compounds, acids)
Column 64 Row 2 (bile compounds, cholanic)


Column 65 


0. Sterols.—Sterols containing a hydrocarbon of 
more than 5 carbons in 17 position. The 2A hydro- 
carbon term does not apply to the sterol hydrocar- 
bon side chain. 

1. Ergosterol.—The ergosterol nucleus. 

2. Cholesterol.—The cholesterol nucleus. 

3. Vitamin D3.—Vitamin D3 has been included
although it is not generally considered to be a
steroid.


Column 66 


0. Hal.—A member of the halogen group.

1. Fl.—Fluorine.

2. Br.—Bromine.

3. I.—Iodine.

4. Cl.—Chlorine.

5-11. Double bonds.—These terms provide a
further delineation of the 2A double bond term, both
broadly and specifically.


Column 67 


1. Androstane.—See supra, page 8. 

2. Addition Compound.—This term designates 
addition compounds such as amine salts, bisulfate 
addition products, etc. 


3. Maleic adducts.—Steroid reaction products of
maleic acid anhydrides or esters.

4. Pregnane.—See supra, page 8. 

5. 21 Unsubstituted.— Pregnanes which are not 
substituted in positional location 21. 

6. 21 diazo.—Steroids containing a diazo group 
in positional location 21. 

9. Misc.—Any general term not specifically pro- 
vided for. 


CODING PROCEDURE 


Instructions for Coding 


1. Read the document for comprehension of the 
subject matter. 

2. Encircle all of the codes representing both the 
2A and 2B terms found in the patent on the coding 
form (Figures 2 and 3) in accordance with the 
principles of multiple coding and composite coding. 

3. Extract all pertinent terms disclosed in the 
patent. The title, text, and claims of the patent 
are all parts of the disclosure for extraction and 
coding purposes. Chemical configurations dis- 
closed as possible substituents as well as those 
more specifically disclosed are extracted and 
coded. 

4. Have the coded information verified by a 
second individual. Note that the coded form 
represents a composite of all the substituents 
disclosed for the steroid nucleus in the given 
patent. 
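
Composite coding amounts to taking the union of the punches of every compound disclosed in one patent. The following is a minimal sketch under that reading; the function name `composite` is hypothetical, and the pairing of the two example compounds in a single patent is invented purely for illustration.

```python
def composite(compounds):
    """Merge the (column, row) punches of several compounds into one card."""
    card = set()
    for punches in compounds:
        card |= set(punches)  # a punch shared by several compounds appears once
    return sorted(card)


# Codes taken from two worked examples in this manual.
ethinyl_androstane = [(6, 7), (67, 1)]                   # Figure 10
chloro_androstane = [(27, 4), (66, 0), (66, 4), (67, 1)]  # Figure 15

print(composite([ethinyl_androstane, chloro_androstane]))
```

The resulting card no longer records which substituent belongs to which compound, which is the trade-off the non-composite method is meant to address.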


Example of Coding 


The compound shown in Figure 27 is coded below. 
Applicable 2A and 2B terms are encircled on the 
coding forms presented in Figures 28 and 29, re- 
spectively. See pages 16 and 17. 


Figure 27 


ALTERNATIVE CODING PROCEDURE—NON- 
COMPOSITE METHOD 


A modification of the coding system described
above can be employed when it is desired to provide
machine selection and discrimination on an
individual compound basis. This modification is
in contrast with the method of composite coding
(supra, page 5).

In individual compound coding, the 2A sub- 
stituents and 2B terms disclosed for the particular 
compound are coded in the usual manner. The 
absence of 2A terms in the remaining positional 
locations on the steroid nucleus is indicated by 
employing the “H” descriptor for each of such 
locations. The only exception is in the 17 position, 
in which the punching of an “H” signifies the 
presence of only one substituent instead of two. 
When the keto group is in positional location
17, no “H” is punched. This device is not used
for positional locations 20, 21 and 22.

To find those compounds which do not have 
double bonds in certain positions, the absence of a 
double bond is asked for by an appropriate wiring 
modification. 
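
In software terms, a search of this kind reduces to asking for some punches to be present and others to be absent. The sketch below is only an analogy to the wiring described above, not a description of the actual machine; the function name `matches` and the sample card are hypothetical.

```python
def matches(card, required, absent=()):
    """True if the card has every required punch and none of the absent ones."""
    punches = set(card)
    return set(required) <= punches and punches.isdisjoint(absent)


# A hypothetical card: double bond at location 4, keto groups at 3 and 20.
card = [(1, 4), (21, 3), (22, 11), (67, 4)]

# Ask for a 3-keto pregnane with no double bond starting at location 1.
print(matches(card, required=[(21, 3), (67, 4)], absent=[(1, 1)]))
# Ask for the same compound class but with no double bond at location 4.
print(matches(card, required=[(21, 3), (67, 4)], absent=[(1, 4)]))
```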

An illustration of individual compound coding is 
shown below for the compound progesterone which 
has the formula given in Figure 30. 


Figure 30 
Codes: Column 1 Row 4
Column 15 Rows 1, 2, 4, 5, 6, 7, 8, 9
Column 16 Rows 1, 2, 4, 5, 6, 7, 8, 9
Column 21 Row 3
Column 22 Row 11
Column 67 Rows 4, 5
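
The “H” punches in columns 15 and 16 of this illustration can be reproduced with a short sketch. Two assumptions are inferred from the progesterone codes rather than stated in the text: locations 10 and 13 are skipped because they carry the methyl groups, and a double bond does not by itself suppress the “H” punch. The function name `h_punches` and its parameters are hypothetical.

```python
def h_punches(occupied, angular_methyl_sites=(10, 13)):
    """Return the 'H' punches for locations that carry no other 2A term.

    occupied: locations holding a 2A substituent other than a double bond.
    """
    punches = []
    for location in range(1, 20):  # the device is not used for 20, 21 and 22
        if location in occupied or location in angular_methyl_sites:
            continue
        # The -H descriptor is taken to occupy card columns 15 and 16.
        if location <= 9:
            punches.append((15, location))
        else:
            punches.append((16, location - 10))
    return punches


# Progesterone: the only substituent within locations 1-19 is the 3-keto group.
progesterone_h = h_punches(occupied={3})
print([row for col, row in progesterone_h if col == 15])
print([row for col, row in progesterone_h if col == 16])
```

Both printed lists agree with the rows listed for columns 15 and 16 above.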


PUNCHED-CARD FORMAT 


The punched-card format is designed to accommodate
steroid disclosures in both patents and
published literature, domestic and foreign. The
format is the same in all cards for columns 1
through 70 as follows (supra, page 5):


a. Columns 1-48 : 2A terms 
b. Columns 60-67 : 2B terms 
c. Columns 49-59, 68, 69: blank 


For patent disclosures, the remaining ten columns
are allocated as follows:

a. Column 70 : code signifying the Class 260 subclass in
which the patent is classified; see the Appendix,
Table I for particular codes.


[Coding form 2A, with the 2A terms applicable to the compound of Figure 27 encircled]

Figure 28


[Coding form 2B, with the 2B terms applicable to the compound of Figure 27 encircled]

Figure 29
b. Column 72 : code designating the country of patent origin;
see the Appendix, Table II for particular codes.

c. Columns 74-80 : patent number.

d. Columns 71, 73 : blank
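
The identification columns of a patent card can be sketched as a simple mapping from column number to punched value, using the codes of Tables I and II in the Appendix. The function name `patent_fields` is hypothetical, the patent number in the usage line is made up, and the zero-filled, right-justified layout of the patent number in columns 74-80 is an assumption that the text does not state.

```python
# Table I: Class 260 subclass -> row punched in column 70.
SUBCLASS_ROW = {
    "239.5": 1, "239.55": 2, "239.57": 3, "397": 4, "397.1": 5, "397.2": 6,
    "397.25": 7, "397.3": 8, "397.4": 9, "397.45": 0, "397.47": 11,
    "397.5": 12,
}

# Table II: country of patent origin -> letter punched in column 72.
COUNTRY_CODE = {
    "Denmark": "S", "France": "F", "Germany": "G", "Great Britain": "B",
    "Italy": "I", "Japan": "J", "Russia and the USSR": "R",
    "Switzerland": "H", "United States": "U",
}


def patent_fields(subclass: str, country: str, patent_number: int) -> dict:
    """Return {column: value} for columns 70-80 of a patent card."""
    digits = str(patent_number)
    if not digits.isdigit() or len(digits) > 7:
        raise ValueError("patent number must fit in columns 74-80")
    digits = digits.zfill(7)  # assumed layout: right-justified, zero-filled
    fields = {70: SUBCLASS_ROW[subclass]}
    # Per the Table II footnote, United States patents carry no punch here.
    if country != "United States":
        fields[72] = COUNTRY_CODE[country]
    for offset, digit in enumerate(digits):
        fields[74 + offset] = digit
    return fields


print(patent_fields("397.45", "United States", 2800000))
```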


For published literature, the remaining ten columns
are allocated as follows:

a. Column 71 : code representing the journal.

b. Column 72 : code designating the country of journal origin;
see the Appendix, Table III for the combined
journal-country codes.

c. Column 73 : year of journal; see the Appendix, Table IV for
particular codes.

d. Column 74 : month or volume of journal, the month taking
precedence. The volume is an arbitrarily assigned
number: 1 for 1958, 2 for 1959, etc.

e. Columns 75-78 : page numbers of journal article.

f. Columns 79, 80 : unique arbitrarily assigned number to each
compound coded in the journal article.


USE IN PATENT SEARCHING 


Use of the above-described system by Mecha- 
nized Examining Division A in examining steroid 
patent applications is outside the scope of this 
manual. Briefly, however, search questions are 
formulated based upon the patent applications in a
manner similar to the method of coding the patents. 
Distinctions arise in the formulations based upon 
various factors including knowledge of the art and 
examination practice and procedures. 
Appendix


Table I

CODES DESIGNATING SUBCLASS ORIGIN
(Punch in Column 70)

Subclass     Row*
239.5         1
239.55        2
239.57        3
397           4
397.1         5
397.2         6
397.25        7
397.3         8
397.4         9
397.45        0
397.47       11
397.5        12

*Code is the same as the number of a particular
row in column 70 since twelve subclasses are the
subject of the punched-card deck.


Table II

CODES DESIGNATING NATIONAL ORIGIN
(Punch in Column 72)*

Code     Country
S        Denmark
F        France
G        Germany
B        Great Britain
I        Italy
J        Japan
R        Russia and the USSR
H        Switzerland
U        United States

*No notation has been made in column 72 of the
present punch-card deck to identify United States
patents.


Table III


CODES IDENTIFYING JOURNAL AND NATIONAL ORIGIN
(Punch in Columns 71 and 72)


Journal name, origin

[The code letters for columns 71 and 72 are illegible in this copy.]

Acta Chemica Scandinavica (Denmark)

Acta Endocrinologica (Denmark)

American Journal of Physiology (United States)

Angewandte Chemie (Germany)

Annales d'endocrinologie (France)

Annalen der Chemie, Justus Liebigs (Germany)

Biochemical Journal (Great Britain)

British Medical Bulletin (Great Britain)

Bulletin de la societe chimique de France (France)

Chemische Berichte (Germany)

Chemistry and Industry (Great Britain)

Comptes rendus des seances de la societe de biologie et de ses filiales
(France)

Current Chemical Papers (Great Britain)

Doklady Akademii Nauk Soyuza Sovetskikh Sotsialisticheskikh Respublik
(Russia)

Experientia (Switzerland)

Gazzetta chimica italiana (Italy)

Helvetica Chimica Acta (Switzerland)

Izvestiya Akademii Nauk Soyuza Sovetskikh Sotsialisticheskikh Respublik.
Otdelenie Khimicheskikh Nauk (Classe des sciences chimiques) (Russia)


Table III.—CODES IDENTIFYING JOURNAL AND NATIONAL ORIGIN—Con.
(Punch in Columns 71 and 72)


Journal name, origin

[The code letters for columns 71 and 72 are illegible in this copy.]

Journal of Biochemistry (Japan)

Journal of Biological Chemistry (United States)

Journal of Organic Chemistry (United States)

Journal of the American Chemical Society (United States)

Journal of the Chemical Society (Great Britain)

Journal of the Chemical Society of Japan (Japan)

Journal of Clinical Endocrinology and Metabolism (United States)

Nature (Great Britain)

Naturwissenschaften (Germany)

Pharmaceutical Bulletin (Japan)

Proceedings of the Society for Experimental Biology and Medicine (United
States)

Science (United States)

Ukrainskii Khimicheskii Zhurnal (Russia)

Voprosy Med. Khimii (Russia)

Zeitschrift fur physiologische Chemie (Hoppe-Seyler's) (Germany)

Zhurnal Obshchei Khimii (Russia)


Table IV 


CODES DESIGNATING YEAR OF JOURNAL 
(Punch in Column 73) 


[The year codes are illegible in this copy.]


43386--U.S. Dept. of Comm--DC--1958


